Atty Dkt No.: ROC920030293US1 
WHAT IS CLAIMED IS: 

1. A method for reducing latencies associated with accessing memory for more
than one processors, each coupled with an associated private cache, the method 
comprising: 

determining cache miss rates of the more than one processors when issuing
cache requests against one or more private caches;

comparing the cache miss rates of the more than one processors; and

allocating cache lines from more than one of the private caches to a processor of
the more than one processors based upon the difference between the cache miss rate
for the processor and the cache miss rates of other processors.

2. The method of claim 1, wherein determining the cache miss rates comprises
counting cache misses of each of the more than one processors. 
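
For illustration only, the counting recited in claim 2 might be sketched as below. This is a minimal sketch, not the claimed implementation: the class name MissRateMonitor, its methods, and the processor labels are hypothetical, and the miss rate is assumed here to be misses divided by cache requests.

```python
from collections import Counter


class MissRateMonitor:
    """Hypothetical per-processor miss counter (all names are illustrative)."""

    def __init__(self):
        self.requests = Counter()  # cache requests issued per processor
        self.misses = Counter()    # cache misses observed per processor

    def record(self, cpu, hit):
        """Record one cache request from `cpu` and whether it hit."""
        self.requests[cpu] += 1
        if not hit:
            self.misses[cpu] += 1

    def miss_rates(self):
        """Return misses / requests for every processor that issued requests."""
        return {
            cpu: self.misses[cpu] / count
            for cpu, count in self.requests.items()
            if count
        }


monitor = MissRateMonitor()
for hit in (True, True, True, False):
    monitor.record("cpu0", hit)
for hit in (False, False, True, False):
    monitor.record("cpu1", hit)
rates = monitor.miss_rates()
```

The resulting per-processor rates are what a comparison step would consume.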

3. The method of claim 1, wherein allocating cache lines comprises forwarding
cache requests from the processor to a private cache associated with another 
processor. 

4. The method of claim 1, wherein allocating cache lines comprises selectively
allocating cache lines based upon a priority associated with a cache request of the 
processor. 

5. A method for reducing cache miss rates for more than one processors, wherein 
the more than one processors couple with private caches, the method comprising: 

monitoring the cache miss rates of the more than one processors; 

comparing the cache miss rates of the more than one processors to determine 
when a cache miss rate of a first processor associated with a first private cache of the 
private caches exceeds a threshold cache miss rate for the more than one processors;

forwarding a cache request associated with the first processor to a second
private cache of the private caches in response to determining the cache miss rate 
exceeds the threshold cache miss rate; 

replacing a cache line in the second private cache with a memory line received 
in response to the cache request; and 

accessing the cache line in response to an instruction from the first processor. 

6. The method of claim 5, wherein monitoring the cache miss rates comprises 
counting cache misses after a cold start, warm-up period. 

7. The method of claim 5, wherein comparing the cache miss rates comprises 
comparing the cache miss rates, the cache miss rates being associated with more than 
one processor modules. 

8. The method of claim 5, wherein the threshold cache miss rate is based upon an 
average cache miss rate for the more than one processors. 
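
As a hedged illustration of claim 8, the sketch below assumes the threshold is simply the arithmetic mean of the per-processor miss rates. The function name processors_over_threshold and the sample rates are hypothetical; the claim only requires that the threshold be based upon an average.

```python
def processors_over_threshold(miss_rates):
    """Return the processors whose miss rate exceeds the average miss rate.

    `miss_rates` maps a processor label to its miss rate. Using the plain
    mean as the threshold is an assumption made for this sketch.
    """
    if not miss_rates:
        return []
    threshold = sum(miss_rates.values()) / len(miss_rates)
    return sorted(cpu for cpu, rate in miss_rates.items() if rate > threshold)


over = processors_over_threshold({"cpu0": 0.25, "cpu1": 0.75, "cpu2": 0.5})
```

In this example the average is 0.5, so only cpu1 would qualify for forwarding.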

9. The method of claim 5, wherein forwarding the cache request comprises 
selecting the second private cache based upon a least recently used cache line 
associated with the private caches. 

10. The method of claim 9, wherein selecting the second private cache comprises 
selecting a least recently used cache line based upon a processor module on which the 
first processor resides. 
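
One possible reading of the selection in claims 9 and 10 is sketched below. It assumes each private cache exposes a timestamp for its least recently used line, and that caches on the requesting processor's own module are preferred when any exist. Both assumptions, along with the names PrivateCache and select_target_cache, are hypothetical and are not taken from the claims.

```python
from dataclasses import dataclass


@dataclass
class PrivateCache:
    owner: str          # processor that owns this private cache
    module: str         # processor module on which the cache resides
    lru_last_used: int  # last-use time of the cache's least recently used line


def select_target_cache(requester, requester_module, caches):
    """Pick the private cache holding the oldest least recently used line.

    Caches on the requester's module are preferred (an assumption of this
    sketch); otherwise every other private cache is considered.
    """
    candidates = [c for c in caches if c.owner != requester]
    local = [c for c in candidates if c.module == requester_module]
    pool = local or candidates
    if not pool:
        return None
    return min(pool, key=lambda c: c.lru_last_used)


caches = [
    PrivateCache("cpu0", "modA", 100),
    PrivateCache("cpu1", "modA", 50),
    PrivateCache("cpu2", "modB", 10),
]
same_module_choice = select_target_cache("cpu0", "modA", caches)
any_module_choice = select_target_cache("cpu0", "modC", caches)
```

Here the same-module preference selects the cache owned by cpu1 even though the cache on modB holds an older line.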

11. The method of claim 5, wherein forwarding the cache request comprises
selecting the cache request based upon a priority associated with the cache request. 




12. The method of claim 5, wherein forwarding the cache request is responsive to a 
software instruction that overrides a result of comparing the cache miss rates to forward 
the cache request to the second private cache. 

13. An apparatus for reducing cache miss rates for more than one processors, 
wherein the more than one processors couple with private caches, the apparatus 
comprising: 

a cache miss rate monitor configured to determine the cache miss rates of the
more than one processors when issuing cache requests against the private caches;

a cache miss rate comparator configured to compare the cache miss rates; and

a cache request forwarder configured to allocate cache lines from more than one
of the private caches to a cache request of a processor of the more than one
processors based upon the difference between the cache miss rate for the processor
and the cache miss rates of other processors.

14. The apparatus of claim 13, wherein the cache miss rate monitor comprises a 
plurality of counters, each configured to count cache misses of a corresponding one of 
the more than one processors. 

15. The apparatus of claim 13, wherein the cache request forwarder is adaptable to 
forward cache requests from the processor to a private cache associated with another 
processor. 

16. The apparatus of claim 13, wherein the cache request forwarder is adapted to
selectively allocate cache lines based upon a priority associated with a cache request 
of the processor. 

17. The apparatus of claim 13, wherein the cache request forwarder comprises a
least recently cache line table to determine which cache line to allocate for use with the 
processor.

18. An apparatus adapted to reduce the latency for accessing memory coupled
thereto, comprising: 

more than one processors to issue cache requests; 

more than one private caches, each individually coupled with one of the more 
than one processors; 

a cache miss rate monitor to determine a cache miss rate with each of the more 
than one processors; 

a cache miss rate comparator to determine when at least one of the cache miss 
rates exceeds a threshold; and 

a cache request forwarder to forward a cache request from a processor of the 
more than one processors that is associated with a cache miss rate determined to 
exceed the threshold, to a private cache of the more than one private caches 
associated with another processor of the more than one processors. 

19. The apparatus of claim 18, wherein the more than one processors and the more 
than one private caches reside on more than one processor modules. 

20. The apparatus of claim 18, wherein the cache miss monitor comprises more than
one cache miss counter, each coupled with one of the more than one processors, to 
start a count of cache misses after a cold start warm-up period. 

21. The apparatus of claim 18, wherein the cache miss comparator comprises a rate
averager to compare the cache miss rates to determine when the cache miss rate of 
the processor exceeds an average cache miss rate associated with the more than one 
processors. 

22. The apparatus of claim 18, wherein the cache request forwarder is responsive to 
a software instruction to forward cache requests from one of the more than one 
processors to the private cache.

23. The apparatus of claim 18, wherein the cache request forwarder is adapted to
select the private cache based upon a least recently used cache line associated with 
the private caches. 

24. The apparatus of claim 23, wherein the cache request forwarder is adapted to 
select the private cache based upon a processor module on which the private cache 
resides. 

25. The apparatus of claim 17, wherein the cache request forwarder is adapted to
select the cache request based upon a priority associated with the cache request. 

26. The apparatus of claim 17, wherein the cache request forwarder inserts the 
cache request into a cache request queue for the private cache to store the memory 
line in the private cache. 

27. The apparatus of claim 26, wherein the cache request forwarder comprises an 
arbitrator to arbitrate between the cache request and another cache request from 
another processor of the more than one processors, to forward the cache request to the 
cache request queue. 
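
The queueing and arbitration recited in claims 25 through 27 could be modeled, purely as a sketch, with a priority queue. The policy shown (higher priority first, arrival order breaking ties) is an assumption, as are the class name CacheRequestQueue and its methods; the claims do not specify an arbitration policy.

```python
import heapq
import itertools


class CacheRequestQueue:
    """Hypothetical request queue for one private cache with simple arbitration."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # monotonically increasing tie-breaker

    def submit(self, priority, source_cpu, address):
        """Enqueue a cache request; a larger `priority` is served first."""
        entry = (-priority, next(self._arrival), source_cpu, address)
        heapq.heappush(self._heap, entry)

    def next_request(self):
        """Return the winning (source_cpu, address) pair, or None if empty."""
        if not self._heap:
            return None
        _, _, source_cpu, address = heapq.heappop(self._heap)
        return source_cpu, address


queue = CacheRequestQueue()
queue.submit(1, "cpu1", 0x100)  # request from the cache's own processor
queue.submit(5, "cpu0", 0x200)  # forwarded request with higher priority
queue.submit(5, "cpu2", 0x300)  # same priority, arrived later
served = [queue.next_request() for _ in range(4)]
```

In this sketch the forwarded request from cpu0 is served ahead of the lower-priority local request from cpu1.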

28. A system, the system comprising: 

a processor module comprising a first processor coupled with a first private 
cache and a second processor coupled with a second private cache; 

a cache miss rate monitor to count cache misses associated with the first 
processor and the second processor; 

a cache miss rate comparator to compare the cache misses associated with the 
first processor against cache misses associated with the second processor; and 

a cache request forwarder to forward cache requests from the first processor to
the second private cache when a number of cache misses associated with the first
processor, related to the first private cache, exceeds a number of cache misses
associated with the second processor.

29. The system of claim 28, further comprising a historical use file containing a set of 
one or more tasks and associated cache miss rate information. 

30. The system of claim 29, further comprising a software application to enable the 
cache request forwarder to forward the cache requests based upon the difference 
between the number of cache misses associated with the first processor and the 
number of cache misses associated with the second processor. 

31. The system of claim 28, wherein the cache request forwarder allocates cache
lines of the first private cache and the second private cache based upon the difference 
between the cache miss rates of the first processor and the second processor. 

32. The system of claim 28, wherein the cache request forwarder forwards cache 
requests from a first processor module of the more than one processor modules to a 
second processor module of the more than one processor modules, the second module 
having a least recently used cache line. 

33. A computer readable medium containing a program which, when executed, 
performs an operation, comprising: 

determining cache miss rates of more than one processors when issuing cache
requests against one or more private caches;

comparing the cache miss rates; and

allocating cache lines from more than one of the private caches to a processor of
the more than one processors based upon a difference between the cache miss rate for
the processor and the cache miss rates of other processors.

34. The computer readable medium of claim 33, wherein allocating cache lines
comprises forwarding cache requests from the processor to a private cache of the 
private caches, wherein the private cache is associated with another processor. 

35. The computer readable medium of claim 33, wherein allocating cache lines 
comprises selectively allocating cache lines based upon a priority associated with a 
cache request of the processor. 

36. A computer readable medium containing a program which, when executed, 
performs an operation, comprising: 

monitoring cache miss rates of more than one processors; 

comparing the cache miss rates of the more than one processors to determine 
when a cache miss rate of a first processor associated with a first private cache 
exceeds a threshold cache miss rate for the more than one processors; 

forwarding a cache request associated with the first processor to a second 
private cache in response to determining the cache miss rate exceeds the threshold 
cache miss rate; 

replacing a cache line in the second private cache with a memory line received 
in response to the cache request; and 

accessing the cache line in response to an instruction from the first processor. 

37. The computer readable medium of claim 36, wherein comparing the cache miss 
rates comprises comparing the cache miss rates, the cache miss rates being 
associated with more than one processor modules. 

38. The computer readable medium of claim 36, wherein the threshold cache miss 
rate is based upon an average cache miss rate for the more than one processors.

39. The computer readable medium of claim 36, wherein forwarding the cache
request comprises selecting the second private cache based upon a least recently used 
cache line associated with the private caches. 

40. The computer readable medium of claim 39, wherein selecting the second 
private cache comprises selecting a least recently used cache line based upon a 
processor module on which the first processor resides. 

41. The computer readable medium of claim 36, wherein forwarding the cache
request comprises selecting the cache request based upon a priority associated with 
the cache request after the cache request misses in the first private cache. 

42. The computer readable medium of claim 36, wherein forwarding the cache 
request is responsive to a software instruction that overrides a result of comparing the 
cache miss rates to forward the cache request to the second private cache. 